Language Production: the Source of the Dictionary

Author

  • David D. McDonald
Abstract

Ultimately in any natural language production system the largest amount of human effort will go into the construction of the dictionary: the data base that associates objects and relations in the program's domain with the words and phrases that could be used to describe them. This paper describes a technique for basing the dictionary directly on the semantic abstraction network used for the domain knowledge itself, taking advantage of the inheritance and specialization mechanisms of a network formalism such as KL-ONE. The technique creates considerable economies of scale, and makes possible the automatic description of individual objects according to their position in the semantic net. Furthermore, because the process of deciding what properties to use in an object's description is now given over to a common procedure, we can write general-purpose rules to, for example, avoid redundancy or grammatically awkward constructions.

Regardless of its design, every system for natural language production begins by selecting objects and relations from the speaker's internal model of the world, and proceeds by choosing an English phrase to describe each selected item, combining them according to the properties of the phrases and the constraints of the language's grammar and rhetoric. To do this, the system must have a data base of some sort, in which the objects it will talk about are somehow associated with the appropriate word or phrase (or with procedures that will construct them). I will refer to such a data base as a dictionary.

Every production system has a dictionary in one form or another, and its compilation is probably the single most tedious job that the human designer must perform. In the past, typically every object and relation has been given its own individual "lex" property with the literal phrase to be used; no attempt was made to share criteria or sub-phrases between properties; and there was a tacit assumption that the phrase would have the right form and content in any of the contexts in which the object would be mentioned. (For a review of this literature, see [...].) However, dictionaries built in this way become increasingly hard to maintain as programs become larger and their discourse more sophisticated. We would like instead some way to tie the extension of the dictionary directly to the extension of the program's knowledge base; then, as the knowledge base expands, the dictionary will expand with it with only a minimum of additional effort.

This paper describes a technique for adapting a semantic abstraction hierarchy of the sort provided by KL-ONE [...] to function directly as a dictionary for my production system MUMBLE [...]. Its goal is largely expositional in the sense that while the technique is fully specified and prototypes have been run, many implementation questions remain to be explored and it is thus premature to present it as a polished system for others to use; instead, this paper is intended as a presentation of the issues, and the potential economies, that the technique is addressing. In particular, given the intimate relationship between the choice of architecture in the network formalism used and the ability of the dictionary to incorporate linguistically useful generalizations and utilities, this presentation may suggest additional criteria for network design, namely to make it easier to talk about the objects the network contains.

The basic idea of "piggybacking" the dictionary onto the speaker's regular semantic net can be illustrated very simply: Consider the KL-ONE network in figure one, a fragment taken from a conceptual taxonomy for augmented transition nets (given in [klone]).
The dictionary will provide the means to describe individual concepts (filled ellipses) on the basis of their links to generic concepts (empty ellipses) and their functional roles (squares), as shown there for the individual concept "C205". The default English description of C205 (i.e. "the jump arc from S/NP to S/DCL") is created recursively from descriptions of the three network relations that C205 participates in: its "superconcept" link to the concept "jump-arc", and its two role-value relations: "source-state(C205)=S/NP" and "nextstate(C205)=S/DCL". Intuitively, we want to associate each of the network objects with an English phrase: the concept "arc" with the word "arc", the "source-state" role relation with the phrase "C205 comes from S/NP" (note the embedded references), and so on. The machinery that actually brings about this association is, of course, much more elaborate, involving three different meta-level networks describing the whole of the original, "domain" network, as well as an explicit representation of the English grammar (i.e. it is itself expressed in KL-ONE).

[Figure one (not reproduced): KL-ONE network fragment; legend entries include role links and value-restriction links.]
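The intuition behind the C205 example can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the paper's machinery: the `Concept` class, the `ROLE_TEMPLATES` table, and the `head_word` and `describe` functions are hypothetical names introduced here, and the meta-level networks and the explicit grammar representation are left out entirely. The sketch shows two things: a concept with no word of its own inherits one from its superconcept, and a description is assembled recursively from the descriptions of the role fillers.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Concept:
    """Hypothetical stand-in for a node in the semantic network."""
    name: str
    parent: Optional["Concept"] = None   # superconcept link
    word: Optional[str] = None           # word attached at this level, if any
    is_named: bool = False               # True if the node is referred to by name
    roles: Dict[str, "Concept"] = field(default_factory=dict)  # role -> filler


# Assumed phrase templates, one per role, shared by every concept using the role.
ROLE_TEMPLATES = {"source-state": "from {}", "nextstate": "to {}"}


def head_word(concept: Concept) -> str:
    """Walk up the superconcept chain until some ancestor supplies a word."""
    node: Optional[Concept] = concept
    while node is not None:
        if node.word is not None:
            return node.word
        node = node.parent
    raise LookupError(f"no word found for {concept.name}")


def describe(concept: Concept) -> str:
    """Build a default description recursively from the concept's relations."""
    if concept.is_named:
        return concept.name
    parts = ["the", head_word(concept)]
    for role, filler in concept.roles.items():
        parts.append(ROLE_TEMPLATES[role].format(describe(filler)))
    return " ".join(parts)


# The network fragment from the example above.
arc = Concept("arc", word="arc")
jump_arc = Concept("jump-arc", parent=arc, word="jump arc")
s_np = Concept("S/NP", is_named=True)
s_dcl = Concept("S/DCL", is_named=True)
c205 = Concept("C205", parent=jump_arc,
               roles={"source-state": s_np, "nextstate": s_dcl})

print(describe(c205))  # → the jump arc from S/NP to S/DCL
```

Because the phrase for each role lives in one shared table and the head word is inherited, adding a new individual arc to the network requires no new dictionary entry, which is the economy of scale the text describes.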

Similar resources

The effect of three vocabulary techniques on the Iranian ESP learners’ vocabulary production

The present study aimed to examine the effect of three vocabulary techniques (dictionary use, etymological analysis, and glossing) on the Iranian ESP learners' vocabulary production. Forty-five university students majoring in architecture at Azad University, Anzali branch, participated in this study. They were divided into three groups, and each group was randomly assigned to one kind of treat...

EFL Translation Students' Perspective toward Using Bilingual Dictionary in Translation of Polysemous Words

This research presented the use of bilingual dictionary and addressed the EFL translation students' points of view on the use of bilingual dictionary in translating polysemous words (English to Persian). Moreover, it aimed at finding the possible relationship between the effect of using bilingual dictionary by students in translating polysemous words and their achieved scores. In the study ...

Dictionary of Abstract and Concrete Words of the Russian Language: A Methodology for Creation and Application

The paper describes the first stage of a project on creating an electronic dictionary with numerical estimates of the degree of abstractness and concreteness of Russian words. Our approach is to integrate data obtained from several different sources: text corpora, psycholinguistic experiments, published dictionaries, markers of abstractness (certain suffixes) and a translation of a similar dict...

A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, the sentiment analysis problem is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...

An Investigation into Bilingual Dictionary Use: Do the Frequency of Use and Type of Dictionary Make a Difference in L2 Writing Performance?

Bilingual dictionary use in L2 writing test performance has recently been the subject of debate. Opinions differ according to how the trait is understood and whether the system favors the process-oriented or product-oriented views towards the assessment and writing skill. Given the need for more empirical support, this study is aimed at investigating the availability of bilingual dictionary use...

Sentiment Analysis of Social Networking Data Using Categorized Dictionary

Sentiment analysis is the process of analyzing a person’s perception or belief about a particular subject matter. However, finding correct opinion or interest from multi-facet sentiment data is a tedious task. In this paper, a method to improve the sentiment accuracy by utilizing the concept of categorized dictionary for sentiment classification and analysis is proposed. A categorized dictiona...

Publication date: 1981